Abstract —Inferring causality using longitudinal observational
databases is challenging due to the passive way the data are
collected. The majority of associations found within longitudinal
observational data are non-causal and occur due to
confounding.

The focus of this paper is to investigate incorporating information
from additional databases to complement the longitudinal
observational database analysis. We investigate the detection of
prescription drug side effects as this is an example of a causal
relationship. In previous work a framework was proposed for
detecting side effects using only longitudinal data. In this paper
we combine a measure of association derived from mining a
spontaneous reporting system database with a previously proposed
analysis that extracts domain expertise features for causal analysis
of a UK general practice longitudinal database.

The results show that there is a significant improvement to 
the performance of detecting prescription drug side effects when 
the longitudinal observation data analysis is complemented by 
incorporating additional drug safety sources into the framework. 
The area under the receiver operating characteristic curve (AUC) 
for correctly classifying a side effect when other data were 
considered was 0.967, whereas without it the AUC was 0.923.
However, the results of this paper may be biased by the evaluation 
and future work should overcome this by developing an unbiased 
reference set. 


I. Introduction 

The current gold standard methodology for inferring 
causality between drugs and health outcomes is to conduct a 
randomised clinical trial m . Methods have been developed for 
identifying associations between drugs and health outcomes 
using longitudinal observational data but, due to the passive
way that data are collected, confounding is a common occurrence
[2]. Confounding is when an association between two
variables is identified but the association is caused by a third 
unobserved variable being associated to both of the variables. 
Due to the problem of confounding, relationships between 
drugs and health outcomes that are detected in longitudinal 
observational databases often require further analysis before 
causality is confirmed. This additional analysis is often in 
the form of experimentation via randomised trials. This is
costly, sometimes unethical and cannot always be implemented 
0. This issue has motivated an active field of research 
into methods that can identify causal relationships without 
requiring additional experimentation. 


In previous work, researchers have investigated using
more advanced supervised data mining methods to identify 
causality in longitudinal observational databases. Examples 
include creating constrained Bayesian networks m or creating 
features based on domain expertise in causal inference 0. In 
the latter work, the authors proposed generating attributes based
on the nine Bradford Hill causality considerations 0 that
are often used by epidemiologists when manually determining
causality between drugs and health outcomes. Training a
classifier to distinguish between causal and non-causal relationships
using attributes derived from five of the Bradford Hill causality
considerations led to a lower false positive rate than
previously obtained using unsupervised methods 0 and was
suitable for causal inference with big data. Unfortunately the
false positive rate was still higher than desired, motivating
further development of the idea by incorporating more of the
Bradford Hill causality considerations. In this paper we investigate
incorporating the consistency consideration from the
Bradford Hill causality considerations and determine whether
adding this consideration improves the classification.

The consistency consideration refers to whether an association
is found consistently across diverse and dispersed
sources of data. If a drug truly causes a specific health
outcome, then the association between the drug and health
outcome should be found in different sources of data. When
an association is only found in one data source, there
is a good chance that it occurred by chance
or due to some form of bias in the way the data were
collected. To incorporate the consistency consideration into
the previously developed causal inference model, we calculate
a measure of association using the USA’s Food and Drug
Administration Adverse Event Reporting System (FAERS)
data 0 to complement the analysis applied to a UK general
practice database known as The Health Improvement Network
(THIN) database (www.thin-uk.com) 0.


The remainder of this paper is organised as follows. In section
II we discuss the importance of incorporating expert domain
knowledge for successful data mining and describe the existing
causal inference method based on the Bradford Hill considerations.
In section III we describe the data used throughout this
paper and the various measures used to evaluate the causal
inference method. This is followed by the new framework that
incorporates the consistency consideration in section IV. In
section V we present the results of the analysis on a reference
set and discuss these results. The paper concludes with section
VI.

II. Background 

There is debate about whether it is domain expertise or 
machine learning skills that are the most important factor for 
successful data mining. It is generally accepted that domain
expertise is important in all aspects of the knowledge discovery
process [9]. Making use of domain expertise to understand the
problem enables the data miner to extract suitable features 
and pre-process the data in a way that enables classifiers to 
distinguish between classes. With well-designed and relevant 
features, it is possible that the classes are separable in the 
feature space. In this situation, any classifier should perform 
reasonably well. However, if the features are unsuitable then 
the majority of classifiers will perform poorly and advanced 
techniques are required. Therefore, whenever possible, it is 
important to incorporate domain expertise into the feature 
extraction to simplify the classification task. 

In 0 the authors incorporated causal inference domain
expertise to extract features that could be used as input for
training a classifier to identify causal relationships between
drugs and health outcomes. The features were extracted based
on Bradford Hill’s causality considerations 0. These are a set
of nine considerations that are often used to identify a causal
relationship such as a drug’s side effects. The considerations
are:

i) Association strength: A measure of dependency
between the drug and health outcome.

ii) Temporality: Does the drug occur before the 
health outcome or the health outcome before the 
drug? 

iii) Specificity: Is the drug only associated to one
health outcome and the health outcome only associated
to one drug?

iv) Consistency: Is there evidence of the association
in different sources of data?

v) Biological gradient: Is there a correlation between 
the dosage of the drug and the occurrence of the 
health outcome? 

vi) Experimentation: Does stopping the drug stop the
health outcome and restarting the drug restart the 
health outcome? 

vii) Coherence: Does the drug causing the health 
outcome make sense or would it contradict known 
knowledge? 

viii) Plausibility: Is the health outcome a possible side 
effect of the drug (e.g. is there knowledge that 
the chemical structure may interact with some 
biological pathway to cause the health outcome)? 

ix) Analogy: Is a similar drug known to cause the
health outcome or the drug known to cause a 
similar health outcome? 

In previous work, the classifier was trained to predict
whether a drug and health outcome pair corresponds to an
adverse drug reaction based on features extracted
from the longitudinal observational data. The extracted features
corresponded to the drug and health outcome relationship’s association
strength, temporality, specificity, biological gradient
and experimentation. This framework, considering these five
Bradford Hill considerations, resulted in AUC values ranging
between 0.883 and 0.937 0. The analogy consideration was not
used to create features, but was indirectly incorporated by
applying a supervised learning technique. The knowledge
of drug and health outcome pairs that are known to correspond
to adverse drug reactions or non-adverse drug reactions is
utilised by the classifier to enable it to learn to predict whether
a drug and health outcome pair corresponds to an adverse
drug reaction based on its extracted Bradford Hill derived
features.

The classifier performed well and it was shown that including
features based on Bradford Hill’s specificity, biological
gradient and experimentation considerations rather than just 
association strength and temporality significantly improved the 
ability to identify adverse drug reactions. Unfortunately, due to 
restricting the analysis to a single database in previous work, it 
was not possible to extract features based on the consistency 
consideration. The plausibility and coherence considerations 
were also not previously used as these require expert knowledge
about the chemical structure of the drug and known
biological pathway interactions. However, the plausibility and 
coherence considerations could be included in future work by 
incorporating chemical structure data. 

In this work we propose a way of combining the spontaneous
reporting system databases with the longitudinal observational
database analysis, which allows us to create features
corresponding to the consistency consideration. It is of interest
to determine whether including a different data source
can improve the framework’s adverse drug reaction detecting 
performance. The FAERS database is partitioned by year and 
quarter. It would be possible to extract a measure of association 
for each drug and health outcome within the FAERS for each 
year from 2010 to 2013. A drug and health outcome with 
a strong association in the THIN data and an association that 
occurs frequently across the FAERS records would be evidence 
of the drug and health outcome corresponding to an adverse 
drug reaction. 

III. Materials 

A. THIN 

The THIN database is a longitudinal observational database 
containing general practice data from the UK. The data 
are extracted directly from the local databases of the 587 
participating general practices and are then validated and 
anonymised. The complete database contains over 3.6 million
active patients and over 12 million patients in total. For each
patient their year of birth and gender are recorded. Additional
demographic data are also often recorded. While a patient is
registered at a participating general practice, any
medical event (e.g., diagnosis, symptom, laboratory test or administration
event) that the patient informs the general practice
of is recorded in a medical table with a corresponding date
of recording. Any drugs that are prescribed during this period
are recorded in a therapy table along with the date of the
prescription. The THIN database contains over 750 million
medical records and over 1 billion therapy records. Screenshots
of the therapy, patient and medical tables contained in
THIN are displayed in Figs. 1-3.


Fig. 1. A screen shot of the THIN therapy table

Fig. 3. A screen shot of the THIN medical table



The medical events are recorded via a clinical encoding
consisting of 5 alphanumeric/dot characters known as a
READ code [10]. Each READ code is linked to a description
string detailing the medical event. The level of a READ code
$x = x_1 x_2 x_3 x_4 x_5$ is defined as $L(x) = \max\{i : x_i \neq \text{‘.’}\}$.

The READ codes have a hierarchical structure with child
READ codes corresponding to the same medical event as
their parents but with more detail, see Fig. 4. A READ code,
$x = x_1 x_2 x_3 x_4 x_5$, is the parent of another READ code,
$y = y_1 y_2 y_3 y_4 y_5$, if the level of READ code $x$ is one less than
the level of READ code $y$ and $x_i = y_i, \forall i \in \mathbb{N}, i \leq L(x)$. For
example, the READ code ‘A....’ corresponds to the description
‘Infection’ and is the parent of the READ code ‘A1...’ corresponding
to ‘Tuberculosis’, which is the parent of the READ
code ‘A11..’ corresponding to ‘Pulmonary tuberculosis’. The
drug prescriptions are recorded in the THIN database via a
multilexeid code. The multilexeid code has a corresponding
string detailing the drug’s generic name and dosage.
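The level and parent definitions can be made concrete with a short sketch. It is written in Python purely for illustration (the study itself used SQL and R), and the function names `read_level` and `is_parent` are our own, hypothetical names rather than part of any READ code tooling.

```python
def read_level(code: str) -> int:
    """Return the level of a 5-character READ code.

    The level is the position of the last character that is not a dot,
    so 'A....' has level 1 and 'A11..' has level 3.
    """
    if len(code) != 5:
        raise ValueError("a READ code has exactly 5 characters")
    return max((i + 1 for i, ch in enumerate(code) if ch != "."), default=0)


def is_parent(x: str, y: str) -> bool:
    """Return True if READ code x is the direct parent of READ code y.

    x must sit exactly one level above y, and the two codes must agree
    on the first L(x) characters.
    """
    level_x = read_level(x)
    return read_level(y) == level_x + 1 and x[:level_x] == y[:level_x]


# The tuberculosis example from the text.
print(read_level("A...."), read_level("A1..."), read_level("A11.."))
print(is_parent("A....", "A1..."), is_parent("A1...", "A11.."))
print(is_parent("A....", "A11.."))  # grandparent, not a direct parent
```

Under these definitions ‘A....’ is a parent of ‘A1...’ but only a grandparent of ‘A11..’, so the last call returns False.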

In this paper we use a subset of the THIN database. The 
subset consists of approximately half of the patients within the 
whole database but contains the complete medical and therapy 
records for these patients. A subset of the THIN database is 
used in this research as this enables us to develop novel analytical
techniques that will later be evaluated on the remaining
THIN data. The potential adverse drug reactions identified 
during the research on the first half of the THIN database 
can be evaluated with standard epidemiological analysis on 
the second half of the database. 

Fig. 4. An example of the hierarchical structure of the READ codes

There are some issues with the THIN database that can bias
analysis. One known problem is that patients can register at a
new general practice at any point in time. This can cause issues
with the recording of their medical events, as it is common for
newly registered patients to inform their new doctor of existing
illnesses. Because the patient is at a new practice, the doctor will
record these existing illnesses, but the recorded date will be the date
the patient informed the doctor of the illness rather than the date
that the illness first occurred. Previous research has shown that
the probability of patients informing their doctors of existing
illnesses is reduced after being at the practice for 12 months
[11]. Therefore, we ignore the first 12 months of data for a
newly registered patient.
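A minimal sketch of this exclusion rule is given below, in Python for illustration only. The record layout (a list of `(event_date, read_code)` tuples), the function names, and the choice to keep events falling exactly on the 12-month anniversary are all assumptions made for the example; the text does not specify them.

```python
from datetime import date


def add_one_year(d: date) -> date:
    """Return the same calendar day one year later (29 Feb maps to 28 Feb)."""
    try:
        return d.replace(year=d.year + 1)
    except ValueError:
        # 29 February has no counterpart in the following year.
        return d.replace(year=d.year + 1, day=28)


def drop_first_year(records, registration_date: date):
    """Keep only the events recorded at least 12 months after registration."""
    cutoff = add_one_year(registration_date)
    return [rec for rec in records if rec[0] >= cutoff]


# Hypothetical patient registered on 1 March 2005.
records = [
    (date(2005, 6, 1), "A1..."),   # within the first 12 months, dropped
    (date(2006, 3, 1), "A11.."),   # on the anniversary, kept (assumption)
    (date(2007, 1, 15), "A...."),  # kept
]
kept = drop_first_year(records, date(2005, 3, 1))
print(kept)
```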

B. FAERS 

The FAERS is a spontaneous reporting system (SRS)
database collected in the USA, see Fig. 5 for the database
structure of the FAERS. SRS databases contain records of
suspected adverse drug reactions. Medical health practitioners
or consumers, such as patients, can submit a record
to a spontaneous reporting system if they suspect they have
witnessed or experienced an adverse drug reaction. The records
therefore contain a link between a drug or set of drugs and a
medical event. The data are stored for each year and quarter.
In this paper we used the FAERS data from 2010 Q1 to 2013
Q4. We combined the Q1-Q4 reports for each year, giving four
datasets containing the reports recorded in 2010, 2011, 2012 and
2013.

The FAERS data contain seven tables:

• Therapy - contains the start and end day of the prescription

• Drug - contains drug name and dosage information

• Reaction - contains the suspected adverse event

• Outcome - contains the outcome of the suspected
adverse drug reaction

• Demographics - contains details about the patient

• Indication - contains the cause of the patient taking
the prescription

• RPSR - contains information about the person submitting
the report

Fig. 5. The structure of the old FAERS database from [12]. The ISR has
now been replaced by the primaryid and caseid.

The drug table contains details of the drug suspected to 
have caused an adverse drug reaction. The details include the 
drug’s generic name in upper case, the drug dosage information 
and the role of the drug within the report (e.g. is it a primary 
suspect or concomitant). The health outcome suspected to 
have been caused by an adverse drug reaction is recorded 
into the reaction table. The column ISR, corresponding to
individual safety report, historically linked the drug and
reaction table records; however, in more recent files this has
been replaced by caseid and primaryid. Within the reaction
table, the health outcome is recorded via a string detailing the 
health outcome. The string comes from a coding system known 
as the Medical Dictionary for Regulatory Activities (MedDRA) 
ED- This coding system was developed specifically for drug 
safety purposes. 

As the THIN and FAERS have different recording codes 
for the medical events and drug prescriptions we will combine 
the records using string matching as both databases contain 
the medical event descriptions and generic drug name strings. 

C. SIDER 

The Bradford Hill based framework for discovering adverse 
drug reactions requires training a classifier to distinguish between
adverse drug reactions and non-adverse drug reactions.
To train such a classifier requires a training set of labelled 
data. This means we need to know a set of drug and health 
outcome pairs where the drug is known to cause the health 


outcome and a set of drug and health outcome pairs where the 
drug is known to not cause the health outcome. 

To find a set of drug and health outcomes where the 
drug is known to cause the health outcome we used the 
online side effect resource known as SIDER M . SIDER 
contains drug and health outcome classifications. A search
can be performed to find the set of health outcomes that
are indications for a specific drug or known side effects of it. The
SIDER authors used text mining to extract the adverse drug reactions
and indications listed on drug packaging labels, in addition to extracting
information from public documents. SIDER uses the MedDRA
coding system.

D. Non adverse events 

To find a set of drug and health outcome pairs where the
drug does not cause the health outcome we identified health
outcomes that do not correspond to an actual illness or cannot
be caused by a drug acutely. This was made possible by the
hierarchical nature of the READ codes. We found parent READ
codes such as ‘family history’ or ‘cancer’ or ‘history of’ and
selected all the child, grandchild or great grandchild READ
codes. These READ codes were considered not possible to be
an acute adverse event. Any drug and READ code pair where
the READ code was from the set of non adverse events was
deemed impossible to correspond to an adverse drug reaction and
could therefore be classed as a non-adverse drug reaction.
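The selection of child, grandchild and great grandchild codes can be sketched as a prefix search over the READ hierarchy. The Python below is illustrative only: the function names are our own, the codes in the example are made up rather than taken from the READ dictionary, and treating "descendant" as "shares the parent's non-dot prefix and sits one to three levels deeper" is our reading of the description.

```python
def read_level(code: str) -> int:
    """Level of a 5-character READ code: position of its last non-dot character."""
    return max((i + 1 for i, ch in enumerate(code) if ch != "."), default=0)


def descendants(parent: str, all_codes, max_depth: int = 3):
    """Return codes up to max_depth levels below parent in the hierarchy.

    A descendant shares the parent's non-dot prefix and has a higher level.
    max_depth=3 covers children, grandchildren and great grandchildren.
    """
    parent_level = read_level(parent)
    prefix = parent[:parent_level]
    return [
        code
        for code in all_codes
        if code.startswith(prefix)
        and 0 < read_level(code) - parent_level <= max_depth
    ]


# Hypothetical codes: 'Z9...' stands in for a parent such as 'family history'.
codes = ["Z9...", "Z91..", "Z912.", "Z9123", "Z8...", "Z81.."]
non_adverse_events = set(descendants("Z9...", codes))
print(sorted(non_adverse_events))
```

Any drug and READ code pair whose READ code falls in `non_adverse_events` would then receive the non-adverse drug reaction label.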

E. Combining the Data Sources 

The SIDER and FAERS data are readily combined as
they both use the MedDRA coding system. Combining these with the THIN
database presents a challenge as its medical events are
recorded via the READ code system. In this work we combined
THIN, FAERS and SIDER by exact, case-insensitive
string matching. For each of the READ codes in THIN the
corresponding description was matched with the MedDRA
code description. For example, if in THIN the READ code’s
description was ‘Vomiting’, then we matched this record
with any SIDER and FAERS record with a MedDRA code
description of ‘vomiting’. This may result in many unmatched
THIN and FAERS/SIDER records that actually correspond to
the same health outcome but have non-generic descriptions, so
the string descriptions are not exactly the same.
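The matching rule can be illustrated with a few lines of Python. This is a sketch rather than the SQL used in the study; the function names and the tuple layout `(outcome description, generic drug name)` are assumptions for the example, and no whitespace trimming or synonym handling is applied because the text describes an exact match apart from case.

```python
def key(outcome: str, drug: str):
    """Case-insensitive comparison key for an (outcome, drug) pair."""
    return (outcome.casefold(), drug.casefold())


def match_records(thin_pairs, srs_pairs):
    """Return the THIN pairs whose outcome and drug strings both match an SRS pair."""
    srs_keys = {key(outcome, drug) for outcome, drug in srs_pairs}
    return [pair for pair in thin_pairs if key(*pair) in srs_keys]


# Hypothetical rows modelled on the 'Vomiting' example in the text.
thin_pairs = [
    ("Vomiting", "Ciprofloxacin"),
    ("Vomiting NEC", "Ciprofloxacin"),  # extra qualifier, so no exact match
]
srs_pairs = [("vomiting", "CIPROFLOXACIN")]
print(match_records(thin_pairs, srs_pairs))
```

Only the first THIN row survives, which shows how descriptions carrying an extra qualifier are lost under exact matching.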

F. Software 

SQL was used to store and pre-process
the data and the open software R m was used to perform the
analysis. The classification was performed using the ‘caret’
library ED and the evaluation was performed using the 
‘pROC’ library 03. 

IV. Framework Incorporating Consistency 
A. Data Creation 

The Bradford Hill framework requires extracting features
from the THIN and FAERS databases for a collection of drug
and health outcome pairs that are known to correspond to
adverse drug reactions (using SIDER) or cannot correspond
to an adverse drug reaction (by selecting health outcomes
that have a clear non-drug cause).






























1) Finding the labels: The first step is to find the drug
and health outcome pairs where there seems to be a temporal
association between the drug and health outcome in THIN and
a true label is known. Given a selection of drugs, for each drug
all the records of patients being prescribed the drug for the first
time are extracted. A drug and READ code pair is created for
each READ code that was recorded within a month of the
first prescription of the drug for three or more patients. The
set containing all these pairs is $P = \{p_i\}$. For a drug and
READ code pair $p_i \in P$, we then calculate the number of
prescriptions of the drug where the READ code occurred in
the month before the drug, $B_i$, and the number of prescriptions
of the drug where the READ code occurred in the month after
the drug, $A_i$. All the drug and READ code pairs where the
READ code occurred more often before the prescription were
excluded, $P = \{p_i \in P : A_i/B_i > 1\}$. The remaining drug
and READ code pairs are the ones that appear to have an
association in THIN.
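The before/after filter can be sketched as follows, in Python for illustration. The dictionary layout, the drug and code names, and the counts are hypothetical. The condition is written as "after count greater than before count", which is equivalent to a ratio above 1 whenever the before count is positive; keeping pairs with a before count of zero is an assumption, since the text does not say how that case was handled.

```python
def temporal_filter(counts):
    """Keep pairs where the READ code is recorded more often after the drug.

    counts maps (drug, read_code) -> (after_count, before_count), i.e. the
    number of prescriptions with the READ code in the month after and in the
    month before the prescription.
    """
    return {
        pair: (after, before)
        for pair, (after, before) in counts.items()
        if after > before  # same as after / before > 1 when before > 0
    }


# Hypothetical counts for one drug.
counts = {
    ("drugX", "A1..."): (12, 4),  # more often after: kept
    ("drugX", "B2..."): (3, 9),   # more often before: excluded
    ("drugX", "C3..."): (5, 0),   # never before: kept (assumption)
    ("drugX", "D4..."): (6, 6),   # ratio of exactly 1: excluded
}
remaining = temporal_filter(counts)
print(sorted(remaining))
```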


TABLE I. The contingency table often used for analysing
SRS data such as FAERS.

              Health outcome m    Other health outcome
Drug n               a                     b
Other drug           c                     d


Where possible these pairs are then labelled as corresponding
to a known adverse drug reaction or non-adverse drug
reaction. This was accomplished by labelling any pair with
a READ code from the non adverse events set detailed in
section III-D as a non-adverse drug reaction. For the remaining
unlabelled pairs, the READ code’s description was matched
with the known SIDER listed adverse drug reactions of the
drug and any pair with a match was labelled as a known
adverse drug reaction. The unlabelled pairs were discarded.
Formally, the label for $p_i \in P$ is

$$y_i = \begin{cases} 1 & \text{if } p_i \text{ is a known side effect on SIDER} \\ 0 & \text{if the READ code of } p_i \text{ is not a possible adverse event} \\ -1 & \text{if the label is unknown} \end{cases} \qquad (1)$$

The drug and READ code pairs of interest are then $P = \{p_i \in
P : y_i \geq 0\}$. This resulted in a set of 8158 labelled drug and
READ code pairs, with 733 labelled as known adverse drug
reactions and 7425 labelled as non-adverse drug reactions.

2) Extracting THIN features: For a labelled drug and
READ code pair, $p_i$, we extracted the association strength,
temporality, specificity, experimentation and biological gradient
features from the THIN database. The extracted association
strength features used various measures of risk. The risk of a
READ code during a defined time period for a set of patients
is simply the number of patients who experience the READ
code during the defined time period divided by the number of
patients. The risk difference is the risk of the READ code
during the month after the prescription for one set of
patients minus the risk of the READ code during the month
after the prescription for a different set of patients. The risk
ratio is the risk of the READ code during the month after
the prescription for one set of patients divided by the risk
of the READ code during the month after the prescription
for a different set of patients. The odds ratio is the odds of the
READ code occurring during the month after the prescription
for one set of patients divided by the odds of the READ
code occurring during the month after the prescription for a
different set of patients. The extracted features for the drug
and READ code pair $p_i$ are:

$x_1$: The risk difference comparing the patients prescribed
the drug and prescribed any other drug.

$x_2$: The risk ratio comparing the patients prescribed
the drug and prescribed any other drug.

$x_3$: The odds ratio comparing the patients prescribed
the drug and prescribed any other drug.

$x_4$: The risk difference comparing the patients prescribed
the drug and prescribed any other drug
but with an additional prescription filter. The filter
removed prescriptions from the THIN data of
any drug where a drug from the same family
was prescribed in the previous 12 months. The
risk difference was then calculated on the filtered
THIN data.
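The three risk measures can be written out in a few lines. The Python below is an illustrative sketch with hypothetical counts and function names; it is not the SQL/R code used in the study and it omits the prescription filter applied for the fourth feature.

```python
import math


def risk(events: int, patients: int) -> float:
    """Proportion of patients with the READ code in the month after prescription."""
    return events / patients


def risk_difference(e_drug, n_drug, e_other, n_other) -> float:
    return risk(e_drug, n_drug) - risk(e_other, n_other)


def risk_ratio(e_drug, n_drug, e_other, n_other) -> float:
    return risk(e_drug, n_drug) / risk(e_other, n_other)


def odds_ratio(e_drug, n_drug, e_other, n_other) -> float:
    def odds(events, patients):
        r = risk(events, patients)
        return r / (1 - r)

    return odds(e_drug, n_drug) / odds(e_other, n_other)


# Hypothetical counts: 30 of 1000 patients on the drug of interest and
# 100 of 10000 patients on any other drug have the READ code recorded.
args = (30, 1000, 100, 10000)
x1 = risk_difference(*args)
x2 = risk_ratio(*args)
x3 = odds_ratio(*args)
print(round(x1, 4), round(x2, 4), round(x3, 4))
assert math.isclose(x2, 3.0)
```

With these counts the risks are 0.03 and 0.01, so the risk difference is 0.02 and the risk ratio is 3, while the odds ratio is slightly larger than the risk ratio because the outcome is rare but not negligible.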


The temporality feature, $x_5$, is $A_i/B_i$. The specificity
features are:

$x_6$: the average age of the patients prescribed the
drug who have the READ code recorded within a
month of the prescription divided by the average
age of the patients prescribed the drug.

$x_7$: the gender ratio (males/females) of the patients
prescribed the drug who have the READ code
recorded within a month of the prescription divided
by the gender ratio of the patients prescribed
the drug.

$x_8$: the READ code level, $L(\cdot)$, of $p_i$’s corresponding READ
code.

The biological gradient feature, $x_9$, is the average drug dosage only
considering the patients prescribed the drug who have the
READ code recorded within a month of the prescription
divided by the average drug dosage when considering all the
patients prescribed the drug. The experimentation feature, $x_{10}$,
is the number of patients who experience the READ code
within a month after a prescription of the drug, and not during
the month before, for two or more distinct prescriptions of the
drug, divided by the number of patients who have a distinct
repeat prescription of the drug.

3) Extracting consistency feature: To extract features corresponding
to the consistency consideration we calculated the
measure of association between a drug and health outcome
for each year of FAERS data. The risk difference was used to
determine a measure of association for each year of FAERS
data, using the values in a contingency table, see Table I. The
risk difference calculation for drug n and health outcome m is

$$RD_{mn} = \frac{a}{a + b} - \frac{c}{c + d} \qquad (2)$$

The consistency feature, $x_{11}$, was then calculated as the
number of yearly FAERS datasets where the drug and health
outcome had a positive risk difference. For example, if the risk
difference for a specific drug and health outcome was 0.4 when
considering the 2010 FAERS data, 0.1 for the 2011 FAERS
data, -0.05 for the 2012 FAERS data and the health outcome
was not recorded with the drug in 2013, then $x_{11} = 2$.
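The calculation of the consistency feature can be sketched in Python as below. The function names and the use of `None` for a year in which the drug and health outcome were not reported together are assumptions for the example; the contingency counts in the first call are hypothetical, while the yearly values in the second call are the ones from the worked example in the text.

```python
import math


def risk_difference(a: int, b: int, c: int, d: int) -> float:
    """Risk difference from the SRS contingency table, following equation (2)."""
    return a / (a + b) - c / (c + d)


def consistency_feature(yearly_rd) -> int:
    """Count the yearly datasets with a positive risk difference.

    yearly_rd maps year -> risk difference, or None when the pair was not
    recorded in that year's data.
    """
    return sum(1 for rd in yearly_rd.values() if rd is not None and rd > 0)


# Hypothetical contingency counts for a single year.
rd_example = risk_difference(a=20, b=80, c=100, d=9900)
assert math.isclose(rd_example, 0.19)

# The worked example from the text.
x11 = consistency_feature({2010: 0.4, 2011: 0.1, 2012: -0.05, 2013: None})
print(x11)  # 2
```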







TABLE II. The THIN and FAERS data were combined when the outcomes and drugs matched exactly.

THIN Outcome    THIN Drug        FAERS Outcome    FAERS Drug       Match
Nausea          Ciprofloxacin    NAUSEA           Ciprofloxacin    Yes
CO Nausea       Ciprofloxacin    NAUSEA           Ciprofloxacin    No
HO Nausea       Ciprofloxacin    Nausea           Ciprofloxacin    No
Nausea          Ciprofloxacin    NAUSEA           Cipro            No
Nausea NED      Ciprofloxacin    NAUSEA           Ciprofloxacin    No


To combine the consistency feature for a drug and health
outcome coded in MedDRA with the THIN features we
matched the READ code’s description string with the FAERS
MedDRA description string and the drug strings in THIN and
FAERS. Table II illustrates the matching implemented.

B. The complete data 

This resulted in a vector of features $x_i \in \mathbb{R}^{11}$ for each
labelled drug and READ code pair $p_i \in P$. Therefore the
labelled data corresponding to $p_i \in P$ are $X = \{(x_i, y_i)\}$. For
the 23 drugs investigated there were 8158 drug-READ code
pairs that could be labelled, with 733 labelled as an adverse
drug reaction.

C. Evaluation 

The Bradford Hill framework’s classifier is evaluated by 
finding how often the classifier correctly classifies a drug and 
READ code pair as corresponding to an adverse drug reaction. 

The labelled data set, $X = \{(x_i, y_i)\}$, was partitioned into 80%
training/testing $X_T$ and 20% validation $X_V$. The classifier is
trained on $X_T$ using 10-fold cross validation to learn a function
$f : \mathbb{R}^{10} \to \{0, 1\}$ that maps a drug and READ code pair’s
Bradford Hill based extracted features into a class of adverse
drug reaction or class of non-adverse drug reaction.

The trained classifier is then applied to the extracted
features of each drug and READ code pair in the validation set
to predict its class, $f(x_i)$ for $(x_i, y_i) \in X_V$, and the prediction
is compared with the truth. The classification is,

• TP when $f(x_i) = 1$ and $y_i = 1$

• TN when $f(x_i) = 0$ and $y_i = 0$

• FP when $f(x_i) = 1$ and $y_i = 0$

• FN when $f(x_i) = 0$ and $y_i = 1$

The sensitivity and specificity of the classifier are:

Sensitivity = TP/(TP + FN)

Specificity = TN/(FP + TN)

The receiver operating characteristic (ROC) curve is then
drawn by plotting the sensitivity against one minus the specificity.
A common measure of performance for a classifier is the
area under the ROC curve (AUC) ED. As we are interested
in a classifier that can identify adverse drug reactions without
incorrectly classifying many non-adverse drug reactions, we
also calculate the partial AUC between the specificity values
0.8 and 1, denoted pAUC$_{[0.8,1]}$. The AUCs of two classifiers can
be compared using the DeLong method and we use
this technique to determine significant differences at a 5%
significance level.
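The evaluation measures can be illustrated with a small, self-contained Python sketch. It is not the ‘pROC’ implementation used in the study: the AUC is computed through its rank interpretation (the probability that a randomly chosen adverse drug reaction receives a higher score than a randomly chosen non-adverse drug reaction, counting ties as one half), and the labels, scores and 0.5 threshold are hypothetical. The partial AUC and the DeLong comparison are not covered.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (TP, TN, FP, FN) for binary labels coded as 1 and 0."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    return tp, tn, fp, fn


def sensitivity(tp, fn):
    return tp / (tp + fn)


def specificity(tn, fp):
    return tn / (fp + tn)


def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of class scores."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical validation labels and classifier scores.
y_true = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print(sensitivity(tp, fn), specificity(tn, fp), auc(y_true, scores))
assert math.isclose(auc(y_true, scores), 8 / 9)
```

Sweeping the threshold over all score values and plotting sensitivity against one minus specificity traces out the ROC curve whose area the `auc` function returns.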



Fig. 6. The ROC plots for the Bradford Hill framework classifier not including
the consistency feature (red), the Bradford Hill framework classifier including
the consistency feature (blue) and the number of years that the FAERS data
had a positive risk difference for the drug and READ code pair (green).

TABLE III. The AUC values for the different classifiers.

Method                                             AUC      pAUC[0.8,1]
Framework incorporating the consistency feature    0.967    0.1794
Framework excluding the consistency feature        0.923    0.1498
The consistency feature alone                      0.807    0.1299


V. Results & Discussion 

The ROC plots for the Bradford Hill framework’s classifier
incorporating the consistency feature, the Bradford Hill
framework’s classifier excluding the consistency feature and
just using the consistency feature are presented in Fig. 6.
The AUC and pAUC$_{[0.8,1]}$ values are displayed in Table
III. It can be seen that incorporating the consistency feature
significantly increased the AUC, 0.967 compared to 0.923
without the consistency feature (p-value $1.02 \times 10^{-5}$). This
shows that incorporating the consistency feature increased
the framework’s ability to detect adverse drug reactions. This
result also suggests that performing analysis by combining
different sources of data can lead to improved results in health
informatics.

The performance of just using the measure of consistency
of an association between a drug and READ code pair
over the years 2010-2013 within the FAERS data resulted
in an AUC of 0.807. The plot shows that the measure of
consistency was able to identify many known adverse drug
reactions, with a high sensitivity when the specificity is also
high. However, there is a point along the specificity axis beyond
which the measure of consistency is no longer able to identify further adverse
drug reactions. This shows that the FAERS data can be used to
identify adverse drug reactions accurately but is limited in that
it cannot identify all the adverse drug reactions. This highlights
the requirement of performing analysis on the combination of
longitudinal healthcare and SRS data to detect adverse drug
reactions.

TABLE IV. Consistency attribute distribution across the
classes.

             x11 = 0    x11 = 1    x11 = 2    x11 = 3    x11 = 4
y_i = 0      7391       15         6          5          8
y_i = 1      272        68         83         111        199

The results suggest that the consistency feature extracted 
from the FAERS data is able to aid the classifier to detect ad¬ 
verse drug reactions that are not reported in the FAERS, as the 
framework incorporating the consistency feature outperformed 
the framework excluding the consistency feature and relying 
on the consistency alone. We suspected that the inclusion 
of consistency feature may bias the classifier due to strong 
correlation between the number of positive risk difference 
values across the years 2010-2014 and the drug and READ 
code pair corresponding to an adverse drug reaction. However, 
this was not the case, even though the consistency feature was 
highly skewed between the classes (see Table IV). Over half of 
the ADRs could be identified, with a small false positive rate, 
using the signalling criterion $x_{i1} \geq 2$; however, the THIN 
features were required to signal the remaining ADRs 
that are reported less often in the FAERS data. 
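
The trade-off described above can be checked directly from the counts in Table IV. The following is a minimal sketch, not the evaluation code used in this work: it assumes that a pair is signalled when its consistency value is at least the chosen threshold, and the names `adr`, `non_adr` and `rates_at_threshold` are introduced here for illustration only.

```python
# Counts copied from Table IV: position k holds the number of drug and
# READ code pairs whose consistency value equals k (k = 0..4).
non_adr = [7391, 15, 6, 5, 8]    # y_i = 0 (known non-ADRs)
adr = [272, 68, 83, 111, 199]    # y_i = 1 (known ADRs)


def rates_at_threshold(threshold):
    """Return (sensitivity, false positive rate) when every pair with a
    consistency value >= threshold is signalled (an assumed rule)."""
    true_positives = sum(adr[threshold:])
    false_positives = sum(non_adr[threshold:])
    return true_positives / sum(adr), false_positives / sum(non_adr)


for threshold in range(1, 5):
    sensitivity, fpr = rates_at_threshold(threshold)
    print(f"value >= {threshold}: sensitivity = {sensitivity:.3f}, "
          f"false positive rate = {fpr:.4f}")
```

Under this reading, a threshold of 2 signals 393 of the 733 known ADRs while signalling 19 of the 7425 known non-ADRs.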

One limitation of this research is the potential bias introduced 
by the data combination and labelling. For example, the labels 
and the consistency feature may be highly correlated because 
the SIDER labels are derived from drug packaging and 
public documents that may themselves have considered the SRS 
data. The reference set of known non-adverse drug reactions also 
introduced a bias, as it is very difficult to know whether a health 
outcome is definitely not an adverse drug reaction to a specific 
drug. The drug and health outcome pairs in the reference set 
corresponding to non-adverse drug reactions were selected because 
the health outcome has a clear non-drug cause. These pairs are 
therefore extremely unlikely to be recorded as 
a suspected adverse drug reaction in the FAERS database. 
Both of these issues bias the consistency attributes 
for the reference set used. As a consequence, the trained 
classifier is likely to predict any drug and health outcome 
pair that is recorded in FAERS, and is therefore likely 
to have a consistency feature value greater than 0, as an 
adverse drug reaction. However, many of the FAERS records 
may not correspond to an actual adverse drug reaction. In 
future work it is important to improve the reference set by 
including drug and health outcome pairs that are known to 
correspond to non-adverse drug reactions but are still plausible 
(i.e., include health outcomes that are common illnesses such 
as ‘vomiting’ or ‘rash’). Evaluating the framework on such a 
reference set will give a less biased measure of how well 
the framework incorporating the Bradford Hill consistency 
consideration performs. 
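
The extent of this skew in the reference set can be read off Table IV. The short sketch below, with variable names introduced here purely for illustration, computes the share of pairs with a nonzero consistency value that are labelled as adverse drug reactions, and the share of non-adverse drug reaction pairs with a consistency value of zero.

```python
# Counts copied from Table IV, indexed by consistency value 0..4.
non_adr = [7391, 15, 6, 5, 8]    # y_i = 0 (known non-ADRs)
adr = [272, 68, 83, 111, 199]    # y_i = 1 (known ADRs)

# Pairs with a consistency value greater than 0, split by label.
nonzero_adr = sum(adr[1:])
nonzero_non_adr = sum(non_adr[1:])

# Share of nonzero-consistency pairs that are labelled as ADRs.
adr_share_nonzero = nonzero_adr / (nonzero_adr + nonzero_non_adr)

# Share of non-ADR pairs whose consistency value is exactly 0.
non_adr_share_zero = non_adr[0] / sum(non_adr)

print(f"ADR share among nonzero pairs: {adr_share_nonzero:.3f}")
print(f"Non-ADR pairs with value 0:    {non_adr_share_zero:.3f}")
```

In this reference set only 34 of the 495 pairs with a nonzero consistency value are non-adverse drug reactions, so the classifier sees very few examples of FAERS-reported pairs that are not adverse drug reactions.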

The framework was also limited by the string matching 
between the THIN READ code descriptions and the MedDRA 
descriptions. Many of the known adverse drug reactions may 
not be labelled in the data because the READ code description 
differs slightly from the MedDRA description, and some of 
the drug and READ code pairs may have missing consistency 
feature values due to problems with the string matching. If a 
natural language processing method were developed for mapping 
the READ code and MedDRA descriptions (or those of any medical 
terminology coding system), it would enable different 
sources of data to be readily integrated and analysed together. 
This is likely to help researchers extract new knowledge. 
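
To illustrate why exact string matching can miss closely related terms, the sketch below compares an exact match after simple normalisation with a character-level similarity score from Python's standard `difflib` module. It is not the matching procedure used in this work, and the description pair is hypothetical, invented only for this example rather than taken from the READ or MedDRA terminologies.

```python
from difflib import SequenceMatcher


def normalise(description):
    """Lower-case the text and collapse repeated whitespace."""
    return " ".join(description.lower().split())


def exact_match(a, b):
    """True only when the two descriptions are identical after normalisation."""
    return normalise(a) == normalise(b)


def similarity(a, b):
    """Character-level similarity in [0, 1]; 1.0 means identical strings."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


# Hypothetical description pair, invented for illustration only.
read_description = "Skin rash"
meddra_description = "Rash"

print(exact_match(read_description, meddra_description))
print(round(similarity(read_description, meddra_description), 2))
```

An exact match rejects this pair even though the two descriptions are closely related. A similarity score exposes the partial overlap, but any cut-off applied to it would need validating against clinical judgement.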

The framework incorporating the Bradford Hill association 
strength, temporality, consistency, specificity, biological 
gradient and experimentation considerations has a high 
performance, but this may be increased by including the 
plausibility and coherence considerations. Other sources of data 
have been used to identify potential adverse drug reactions, 
including chemical structure data. It may be possible to combine 
more sources of data, such as chemical structure databases, to 
cover all the Bradford Hill considerations and develop a 
framework that can detect any adverse drug reaction with an even 
higher specificity and sensitivity. 

VI. Conclusion 

In this paper we have proposed a way to incorporate a 
measure of how consistent an association between a drug 
and health outcome is by combining different forms of drug 
safety data. This increased the existing Bradford Hill based 
causal inference framework’s ability to identify adverse drug 
reactions in longitudinal observational data. The results show 
that incorporating features derived from the FAERS database 
significantly improved the classifier's ability to distinguish 
between adverse drug reaction relationships and non-adverse 
drug reaction relationships. 

In future work a new reference set could be developed to 
evaluate the framework fairly. It would also be of interest to 
incorporate chemical structure databases to include features 
based on the plausibility and coherence Bradford Hill 
considerations. 